 Automobiles & Trucks


Tesla has begun testing driverless robotaxis in Austin ahead of June 12 launch, report says

Mashable

We now have a tentative launch date for Tesla's long-awaited robotaxi service in Austin, Texas: June 12. How long has Tesla been testing out these driverless vehicles that will soon be on the public streets of a major U.S. city? According to Tesla CEO Elon Musk, testing has been going on for "several days." "For the past several days, Tesla has been testing self-driving Model Y cars (no one in driver's seat) on Austin public streets with no incidents," Musk posted on his X account on Thursday. "A month ahead of schedule," Musk continued.


Interpretable Image Classification with Adaptive Prototype-based Vision Transformers

Neural Information Processing Systems

This method classifies an image by comparing it to a set of learned prototypes, providing explanations of the form "this looks like that." In our model, a prototype consists of parts, which can deform over irregular geometries to create a better comparison between images. Unlike existing models that rely on Convolutional Neural Network (CNN) backbones and spatially rigid prototypes, our model integrates Vision Transformer (ViT) backbones into prototype-based models, while offering spatially deformed prototypes that not only accommodate geometric variations of objects but also provide coherent and clear prototypical feature representations with an adaptive number of prototypical parts. Our experiments show that our model can generally achieve higher performance than the existing prototype-based models. Our comprehensive analyses ensure that the prototypes are consistent and the interpretations are faithful. Our code is available at https://github.com/Henrymachiyu/ProtoViT.
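The "this looks like that" comparison described above can be illustrated with a minimal sketch. This is not the ProtoViT implementation: it assumes the generic prototype-classifier recipe (cosine similarity between patch features and prototypes, a max over patches, then a linear layer over the similarity scores), and the names `cosine`, `classify`, `patch_features`, `prototypes`, and `class_weights` are hypothetical. Deformable, multi-part prototypes are not modeled here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(patch_features, prototypes, class_weights):
    """Return (predicted class index, per-prototype similarity scores).

    patch_features: list of feature vectors, one per image patch (assumed input).
    prototypes:     list of learned prototype vectors (assumed, rigid, single-part).
    class_weights:  one row per class, one weight per prototype (assumed linear head).
    """
    # Each prototype's score is its best match over all patches of the image.
    scores = [max(cosine(p, f) for f in patch_features) for p in prototypes]
    # Class logits are weighted sums of the prototype scores.
    logits = [sum(w * s for w, s in zip(row, scores)) for row in class_weights]
    best = max(range(len(logits)), key=logits.__getitem__)
    return best, scores


patch_features = [[1.0, 0.0], [0.6, 0.8]]
prototypes = [[2.0, 0.0], [0.0, -1.0]]
class_weights = [[1.0, 0.0], [0.0, 1.0]]
best, scores = classify(patch_features, prototypes, class_weights)
```

The per-prototype scores double as the explanation: the patch that achieves the maximum for a prototype is the region that "looks like" it.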


WindsorML: High-Fidelity Computational Fluid Dynamics Dataset For Automotive Aerodynamics

Neural Information Processing Systems

This paper presents a new open-source high-fidelity dataset for Machine Learning (ML) containing 355 geometric variants of the Windsor body, to help the development and testing of ML surrogate models for external automotive aerodynamics. Each Computational Fluid Dynamics (CFD) simulation was run as a GPU-native, high-fidelity Wall-Modeled Large-Eddy Simulation (WMLES) using a Cartesian immersed-boundary method with more than 280M cells to ensure the greatest possible accuracy. The dataset contains geometry variants that exhibit a wide range of flow characteristics representative of those observed on road cars. The dataset itself contains the 3D time-averaged volume and boundary data as well as the geometry and the force and moment coefficients.


3dbb8b6b5576b85afb3037e9630812dc-Paper-Conference.pdf

Neural Information Processing Systems

The reliability of driving perception systems under unprecedented conditions is crucial for practical usage. Recent advancements have prompted increasing interest in multi-LiDAR perception. However, prevailing driving datasets predominantly utilize single-LiDAR systems and collect data devoid of adverse conditions, failing to capture the complexities of real-world environments accurately. Addressing these gaps, we propose Place3D, a full-cycle pipeline that encompasses LiDAR placement optimization, data generation, and downstream evaluations. Our framework makes three appealing contributions. 1) To identify the most effective configurations for multi-LiDAR systems, we introduce the Surrogate Metric of the Semantic Occupancy Grids (M-SOG) to evaluate LiDAR placement quality.


Dense Connector for MLLMs

Neural Information Processing Systems

Do we fully leverage the potential of the visual encoder in Multimodal Large Language Models (MLLMs)? The recent outstanding performance of MLLMs in multimodal understanding has garnered broad attention from both academia and industry. In the current MLLM rat race, the focus seems to be predominantly on the linguistic side.


NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking

Neural Information Processing Systems

Benchmarking vision-based driving policies is challenging. On one hand, open-loop evaluation with real data is easy, but these results do not reflect closed-loop performance. On the other, closed-loop evaluation is possible in simulation, but is hard to scale due to its significant computational demands.


Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline

Neural Information Processing Systems

Current end-to-end autonomous driving methods either run a controller based on a planned trajectory or perform control prediction directly, and these two approaches have developed as separately studied lines of research. Seeing their potential mutual benefits, this paper takes the initiative to explore the combination of these two well-developed worlds. Specifically, our integrated approach has two branches for trajectory planning and direct control, respectively. The trajectory branch predicts the future trajectory, while the control branch involves a novel multi-step prediction scheme such that the relationship between current actions and future states can be reasoned about. The two branches are connected so that the control branch receives corresponding guidance from the trajectory branch at each time step. The outputs from the two branches are then fused to achieve complementary advantages. Our results are evaluated in the closed-loop urban driving setting with challenging scenarios using the CARLA simulator. Even with a monocular camera input, the proposed approach ranks first on the official CARLA Leaderboard, outperforming other complex candidates with multiple sensors or fusion mechanisms by a large margin.
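The fusion step mentioned above can be pictured with a small sketch. The abstract does not specify how the two outputs are combined, so this assumes a simple convex combination with a hypothetical weight `alpha`; the `Control` type and the `fuse` function are illustrative names, not the paper's API.

```python
from dataclasses import dataclass


@dataclass
class Control:
    """A single low-level driving command (hypothetical representation)."""
    steer: float
    throttle: float
    brake: float


def fuse(traj_ctrl: Control, direct_ctrl: Control, alpha: float) -> Control:
    """Blend two control commands.

    traj_ctrl:   command from a controller tracking the planned trajectory.
    direct_ctrl: command predicted directly by the control branch.
    alpha:       assumed weight in [0, 1] given to the trajectory-based command.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    def mix(a: float, b: float) -> float:
        return alpha * a + (1.0 - alpha) * b

    return Control(
        steer=mix(traj_ctrl.steer, direct_ctrl.steer),
        throttle=mix(traj_ctrl.throttle, direct_ctrl.throttle),
        brake=mix(traj_ctrl.brake, direct_ctrl.brake),
    )


fused = fuse(Control(0.2, 0.5, 0.0), Control(0.4, 0.7, 0.0), alpha=0.5)
```

With `alpha = 1.0` the result is the trajectory-following command alone, and with `alpha = 0.0` it is the direct control prediction alone; intermediate values trade one off against the other.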


Score Distillation via Reparametrized DDIM

Neural Information Processing Systems

While 2D diffusion models generate realistic, high-detail images, 3D shape generation methods like Score Distillation Sampling (SDS) built on these 2D diffusion models produce cartoon-like, over-smoothed shapes. To help explain this discrepancy, we show that the image guidance used in Score Distillation can be understood as the velocity field of a 2D denoising generative process, up to the choice of a noise term. In particular, after a change of variables, SDS resembles a high-variance version of Denoising Diffusion Implicit Models (DDIM) with a differently-sampled noise term: SDS introduces noise i.i.d.


into a robustified action set

Neural Information Processing Systems

We illustrate the online optimization process of RCL in Figure 1. Finally, we discuss the performance of RCL. We consider the management of N batteries. This problem falls into SOCO based on the reduction framework described in [49]. We consider each problem instance as one day (T = 24 hours, plus an initial action).